We compare the CPU cost of HMC to various implementations of the Hermitean
variant of Lüscher's local bosonic algorithm (LBA) for 2D massive QED with two
flavours of Wilson fermions. We carefully scan a 3-dimensional parameter
subspace and find flat behaviour around the optimum. The gain factor of the
LBA, as compared to HMC, is slightly smaller for the Re-weighting method than
for the Metropolis variants and is estimated to be about 3.3 for the plaquette
and 1.9 for the meson correlator.



1. Schwinger model 

For our tests we select as a low-cost laboratory the massive 2-flavour
Schwinger model (2D QED). The lattice model is given by the Wilson plaquette
and fermionic action. For details we refer to last year's proceedings. We work
with the Hermitean fermion matrix $Q = c_0 \gamma_5 M$, scaled so that its
eigenvalues are in $[-1, 1]$.

2. Hybrid Monte Carlo 

To set the scale, we simulate using an HMC code that also works with the
Hermitean matrix $Q$. The implementation includes optimization features such
as a trajectory length set by $n \cdot \Delta t = 1$ and acceptance
$\approx 70\%$, and re-use of the CG solution within the trajectory
(gain $\approx 20\%$).

3. Hermitean local bosonic algorithm 

Alternatively to HMC, M. Lüscher proposed a local bosonic formulation. The one
main variant we consider uses the fact that we are dealing with a Hermitean
fermion matrix, thus allowing us to rewrite the effective distribution exactly
in terms of $P_n(s)$, a polynomial of even degree $n$ approximating $1/s$ for
real $s \in [\epsilon, 1]$ such that the correction factor
$\det(1 - R) = \det[Q^2 P_n(Q^2)] \approx 1$. Its roots $z_k$
($k = 1 \ldots n$) come in complex conjugate pairs and determine
$\sqrt{z_k} = \mu_k + i\nu_k$ ($\nu_k > 0$). This leads to a totally bosonic
representation with $n$ complex bosonic Dirac fields $\phi_k$. We chose as
approximation polynomials $P_n(s)$ the Chebyshev polynomials proposed by Bunk
et al. The convergence of $P_n(s) \to 1/s$ as $n \to \infty$ is exponential
and uniform for $s \in [\epsilon, 1]$. One update of the bosonic system
consists of a trajectory of heat bath and over-relaxation steps for the $U$
and $\phi$ fields, possibly followed by a Metropolis acceptance correction for
$\det[1 - R]$. In total, this introduces the parameters $n$, $\epsilon$ and
the number of reflections per heat bath into the algorithm.
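The uniform, exponential convergence can be checked numerically. The sketch
below assumes the standard Chebyshev construction from the LBA literature,
$P_n(s) = [1 - R(s)]/s$ with $R(s) = T_{n+1}(y(s)) / T_{n+1}(y(0))$ and
$y(s) = (2s - 1 - \epsilon)/(1 - \epsilon)$; whether this agrees in every
detail with the polynomial used for the runs reported here is an assumption,
and all function names are illustrative.

```python
import numpy as np

def cheb_R(s, n, eps):
    """R(s) = T_{n+1}(y(s)) / T_{n+1}(y(0)) for real s in [eps, 1]."""
    m = n + 1
    y = np.clip((2.0 * s - 1.0 - eps) / (1.0 - eps), -1.0, 1.0)
    # y(0) = -(1 + eps)/(1 - eps) < -1, hence
    # T_m(y(0)) = (-1)^m cosh(m * arccosh(|y(0)|)).
    t0 = (-1.0) ** m * np.cosh(m * np.arccosh((1.0 + eps) / (1.0 - eps)))
    return np.cos(m * np.arccos(y)) / t0

def cheb_P(s, n, eps):
    """Polynomial approximation to 1/s on [eps, 1] (closed form)."""
    return (1.0 - cheb_R(s, n, eps)) / s

def cheb_roots(n, eps):
    """Roots z_k (k = 1..n) of P_n; z_k and z_{n+1-k} are complex conjugates."""
    phi = 2.0 * np.pi * np.arange(1, n + 1) / (n + 1)
    return 0.5 * (1.0 + eps) * (1.0 - np.cos(phi)) - 1j * np.sqrt(eps) * np.sin(phi)

def error_bound(n, eps):
    """Uniform bound on |R(s)| = |1 - s P_n(s)| for s in [eps, 1]."""
    return 1.0 / np.cosh((n + 1) * np.arccosh((1.0 + eps) / (1.0 - eps)))
```

Calling `error_bound(n, eps)` for increasing `n` at fixed `eps` shows the
exponential decrease of the relative error, which is what makes the trade-off
between $n$ and $\epsilon$ tunable.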

4. Exact algorithm 

The correction factor can be treated exactly with a Metropolis accept/reject
step or with Re-weighting, using a stochastic method as demonstrated for the
non-Hermitean variant.

The Re-weighting method computes a noisy estimate for $\det[1 - R]$ and
includes it in the observables.
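As an illustration of a noisy determinant estimate, the following sketch uses
the Gaussian identity
$\langle \exp(-\chi^\dagger (A^{-1} - 1) \chi) \rangle_\chi = \det A$ for a
Hermitean positive definite matrix $A$ (here standing in for $1 - R$). This is
one common estimator and is assumed for illustration only; the helper names
are hypothetical, and a dense solve replaces the iterative inverter one would
use on a lattice.

```python
import numpy as np

def noisy_det_estimates(A, n_noise, rng):
    """Return samples w with E[w] = det(A), A Hermitean positive definite.

    The noise chi is complex Gaussian with density ~ exp(-chi^dagger chi).
    """
    dim = A.shape[0]
    chi = (rng.normal(size=(n_noise, dim))
           + 1j * rng.normal(size=(n_noise, dim))) / np.sqrt(2.0)
    x = np.linalg.solve(A, chi.T).T          # rows hold A^{-1} chi
    exponent = np.einsum("ij,ij->i", chi.conj(), x - chi).real
    return np.exp(-exponent)

def reweighted_mean(observable, weights):
    """Re-weighted average <O w> / <w> over a set of configurations."""
    return np.sum(weights * observable) / np.sum(weights)
```

In a simulation one would draw one or a few noise vectors per configuration
and pass the resulting weights together with the measured observable to
`reweighted_mean`.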

The Metropolis correction step uses an acceptance probability
$P_{\mathrm{acc}}(\chi)$ dependent on a Gaussian noise $\chi$, which satisfies
detailed balance when averaged over the noise. This can be achieved formally
with an operator $B$ given by $B B^\dagger = Q^2 P_n(Q^2)$. In the Hermitean
case this is non-trivial to solve. The task becomes trivial if we recall the
factorized form

$$P_n(Q^2) \propto \prod_{k=1}^{n/2} (Q^2 - \bar{z}_k)(Q^2 - z_k),$$

taking one factor from each c.c. pair.
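The statement that one factor per complex conjugate pair yields a valid $B$
can be verified on a small matrix. The sketch assumes the standard Chebyshev
roots of the LBA literature and a positive normalisation constant $c$ fixed by
$P_n(s_0) = 1/s_0$ at $s_0 = (1 + \epsilon)/2$ (valid for even $n$); the
function names and the random test matrix are illustrative and not taken from
the paper.

```python
import numpy as np

def cheb_roots(n, eps):
    """Assumed Chebyshev roots z_k, k = 1..n; z_k and z_{n+1-k} are conjugates."""
    phi = 2.0 * np.pi * np.arange(1, n + 1) / (n + 1)
    return 0.5 * (1 + eps) * (1 - np.cos(phi)) - 1j * np.sqrt(eps) * np.sin(phi)

def build_B(Q, n, eps):
    """B = sqrt(c) Q prod_{k<=n/2} (Q^2 - z_k), so that B B^dagger = Q^2 P_n(Q^2)."""
    z = cheb_roots(n, eps)
    s0 = 0.5 * (1 + eps)
    c = 1.0 / (s0 * np.prod(s0 - z)).real   # real and positive for even n
    Q2 = Q @ Q
    eye = np.eye(Q.shape[0])
    B = np.sqrt(c) * Q
    for zk in z[: n // 2]:                   # one root from each c.c. pair
        B = B @ (Q2 - zk * eye)
    return B, c, z

def random_hermitean_Q(dim, eps, rng):
    """Random Hermitean Q whose squared eigenvalues lie in [eps, 1]."""
    lam = np.sqrt(rng.uniform(eps, 1.0, size=dim)) * rng.choice([-1.0, 1.0], size=dim)
    V, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))
    return (V * lam) @ V.conj().T, lam, V
```

Because $Q$ is Hermitean and all factors commute, `B @ B.conj().T` reproduces
$Q^2 P_n(Q^2)$, whose eigenvalues stay close to 1 for $Q^2$ inside
$[\epsilon, 1]$.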

We remark that the Metropolis scheme includes a further optimization
possibility, which we call Metropolis with adapted precision. We retain a
valid algorithm if the inversion necessary for the Metropolis decision is
first executed with low precision (in our case $\delta = 10^{-\ldots}$) and
repeated with standard inverter precision ($\delta = 10^{-\ldots}$) only if
the decision would otherwise be unclear. As the CG can be restarted from the
intermediate solution, this procedure could result in less work on average.
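The two-stage decision can be sketched as follows. The criterion for an
"unclear" decision is an assumption made here for illustration: the
low-precision result is trusted only if the Metropolis inequality holds with a
margin larger than an error bound supplied by the inverter. The callable
`delta_s` and its return convention are hypothetical.

```python
import math

def adaptive_accept(u, delta_s, loose_tol, tight_tol):
    """Metropolis decision 'accept iff u < exp(-dS)' with adapted precision.

    u        : uniform random number in (0, 1]
    delta_s  : callable tol -> (estimate of dS, bound on its error);
               assumed to restart from the previous solution when refined
    Returns (accepted, refined), where refined tells whether the
    standard-precision inversion was needed.
    """
    log_u = math.log(u)
    estimate, error = delta_s(loose_tol)
    # The decision is unambiguous if the margin exceeds the error bound.
    if abs(log_u + estimate) > error:
        return log_u < -estimate, False
    estimate, _ = delta_s(tight_tol)
    return log_u < -estimate, True
```

Only the configurations whose random number falls within the error band pay
for the second, more precise inversion.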

5. Numerical instabilities 

Evaluating a high order polynomial for a matrix faces the problem of loss of
precision. In the non-Hermitean variant or Re-weighting case we are able to
avoid this by using the Chebyshev recursion formula for $R$. Unfortunately the
Hermitean variant with Metropolis correction and the Polynomial Hybrid
Monte-Carlo algorithm rely on the evaluation of partial products of root
factors.

In a forthcoming publication, we compare various proposals for reordering the
roots to minimize numerical instabilities. In this work, we use a fairly
stable version, the so-called bit-reversal scheme, whenever no recursion is
possible.
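A bit-reversal ordering of the root indices can be generated as in the sketch
below. How degrees that are not a power of two are handled is an assumption
here (indices are padded to the next power of two and the surplus is dropped);
the helper names are illustrative.

```python
def bit_reversal_order(n):
    """Indices 0..n-1 in bit-reversed order, padded to the next power of two."""
    bits = max(1, (n - 1).bit_length())
    order = []
    for i in range(1 << bits):
        j = int(format(i, "0{}b".format(bits))[::-1], 2)
        if j < n:                      # drop padding indices
            order.append(j)
    return order

def partial_product_magnitudes(s, roots, order):
    """|prod_{k in order[:m]} (s - roots[k])| for m = 1..len(order)."""
    magnitudes, p = [], 1.0 + 0.0j
    for k in order:
        p *= s - roots[k]
        magnitudes.append(abs(p))
    return magnitudes
```

Comparing `partial_product_magnitudes` for the natural order and for
`bit_reversal_order(n)` at several values of `s` gives a quick view of how far
intermediate results stray from the final product.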

Recently, one of us proposed a different solution. We rewrite the Metropolis
acceptance in terms of a vector $\eta$ given by
$\chi = [Q^2 P_n(Q^2)]^{1/2} \eta$. At this point we suggest solving for the
inverse of the square root of $Q^2 P_n(Q^2)$ directly. This can be
accomplished with an expansion in Gegenbauer polynomials converging
exponentially with the same rate as CG.
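A minimal sketch of such an expansion is given below. It rests on the
generating function $(1 - 2tz + t^2)^{-1/2} = \sum_m t^m P_m(z)$, where the
Legendre polynomials $P_m$ are the Gegenbauer polynomials of index 1/2.
Writing $A = c\,(1 + t^2 - 2tZ)$ with the spectrum of $Z$ in $[-1, 1]$ gives
$t = (\sqrt{b} - \sqrt{a})/(\sqrt{b} + \sqrt{a})$ for a spectrum of $A$ in
$[a, b]$, which is the CG convergence rate. The interface and names are
assumptions for illustration, not the implementation used for the runs.

```python
import numpy as np

def inv_sqrt_apply(apply_A, v, a, b, tol=1e-12, max_terms=1000):
    """Approximate A^{-1/2} v for Hermitean A with spectrum inside [a, b], 0 < a < b."""
    ra, rb = np.sqrt(a), np.sqrt(b)
    t = (rb - ra) / (rb + ra)
    c = 0.25 * (ra + rb) ** 2          # A = c (1 + t^2 - 2 t Z)

    def apply_Z(x):
        return ((1.0 + t * t) * x - apply_A(x) / c) / (2.0 * t)

    p_prev = v.copy()                  # P_0(Z) v
    p_curr = apply_Z(v)                # P_1(Z) v
    result = p_prev + t * p_curr
    t_pow = t
    for m in range(1, max_terms):
        # Legendre recursion: (m+1) P_{m+1} = (2m+1) Z P_m - m P_{m-1}
        p_next = ((2 * m + 1) * apply_Z(p_curr) - m * p_prev) / (m + 1)
        t_pow *= t
        result = result + t_pow * p_next
        p_prev, p_curr = p_curr, p_next
        if t_pow / (1.0 - t) < tol:    # geometric bound on the remainder
            break
    return result / np.sqrt(c)
```

Each term costs one application of $A$, so the work per digit of accuracy is
comparable to a CG iteration, and no partial products of root factors are
formed.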



Figure 1. 8x20 lattice: pion, eta, a0 versus kappa, beta = 3.0.

6. Observables

We simulate on 8x20 lattices with a conservative $\beta = 3.0$ and a run
length of usually $> 1000\tau$, calculating the plaquette and correlations of
local meson operators $(\bar\psi \gamma_5 \tau \psi)$ ($\pi$),
$(\bar\psi \tau \psi)$ ($a_0$), $(\bar\psi \gamma_5 \psi)$ ($\eta$). We expect
finite size effects to appear in deviations from the approximately linear
behaviour of the pion mass with $\kappa$. As shown in Fig. 1, finite size
effects are small for $\kappa < 0.24$, resulting in a pion mass
$m_\pi = 0.629$ and a physical ratio $m_\pi / m_\eta = 0.807$.

7. Cost comparison 

To summarize, we repeat the schemes included in our investigation. Besides
Hybrid MC to set the scale, we compare LBA with Re-weighting, LBA with
Metropolis, LBA with Metropolis using adapted precision, and LBA with
Gegenbauer inverter as described above.

We point out that Re-weighting and Metropolis algorithms result in 2 different
ensembles. Therefore it is no longer possible to compare CPU cost by

$$C = \frac{\cdots}{N_{\mathrm{meas}}} \cdot N_{\mathrm{all}\ Q\ \mathrm{ops}}$$

but one has to use a measure based on the relative error.
Figure 2. Cost for the plaquette (optimal number of reflections), Metropolis.
Table 1. CPU cost minima.

Plaquette

algorithm       n    ε      refl.   cost   gain
HMC             -    -      -       6.1    -
Metro 1         18   0.02   …       1.9    3.3
Metro 2         18   0.02   …       2.1    2.9
Gegenbauer      18   0.01   2       2.0    3.0
Re-weighting    24   0.02   …       2.6    2.4

Pion propagator

algorithm       n    ε      refl.   cost   gain
HMC             -    -      -       574    -
Metro 1         24   0.01   2       501    1.1
Metro 2         18   0.04   1       320    1.8
Gegenbauer      12   0.04   1       304    1.9
Re-weighting    18   0.02   4       414    1.4

In these formulae $N_{\mathrm{meas}}$ signifies the number of measurements and
$N_{\mathrm{all}\ Q\ \mathrm{ops}}$ the total number of matrix multiplications
for the whole run.

To illustrate our search for optimal parameters, we depict in Fig. 2 the CPU
cost in the $n$-$\epsilon$ plane (number of reflections optimized) for one
algorithm, namely plain Metropolis. The figure clearly shows that we obtain a
flat optimum.

We further compare only the optimal parameter sets and their CPU cost in
Table 1.
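The gain column of Table 1 can be cross-checked from the cost column. The
snippet assumes that the gain is the ratio of the HMC cost to the cost of the
respective LBA variant; recomputing it from the rounded table entries
reproduces the quoted gains to within about 0.1, the residual presumably being
due to rounding of the costs.

```python
# Costs as listed in Table 1 (rounded values).
COSTS = {
    "plaquette": {
        "HMC": 6.1, "Metro 1": 1.9, "Metro 2": 2.1,
        "Gegenbauer": 2.0, "Re-weighting": 2.6,
    },
    "pion propagator": {
        "HMC": 574.0, "Metro 1": 501.0, "Metro 2": 320.0,
        "Gegenbauer": 304.0, "Re-weighting": 414.0,
    },
}

def gain_factors(cost_by_algorithm, reference="HMC"):
    """Gain = reference cost / cost, for every algorithm except the reference."""
    ref = cost_by_algorithm[reference]
    return {name: ref / cost
            for name, cost in cost_by_algorithm.items() if name != reference}

for observable, costs in COSTS.items():
    for name, gain in gain_factors(costs).items():
        print(f"{observable:16s} {name:13s} gain = {gain:.2f}")
```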



8. Conclusions

We think that this study gives further clear evidence that the LBA is
competitive with HMC. The tuning of the LBA is fairly easy. The CPU cost can
be lower than for HMC, but not by a large factor with present techniques. We
want to stress that the gain factors for the plaquette and pion propagator
differ, with estimates from plaquette-like observables being too optimistic.

Technically, the number of reflections per heat-bath update is found to be an
important optimization tool. We also demonstrate that using a Noisy Metropolis
scheme to make the LBA exact is possible for the Hermitean case as well. The
Gegenbauer inverter, which avoids instabilities in the evaluation of the
polynomial, is shown to be competitive with CG in a first real simulation.
[Figure panel title: U(1) pure gauge model, 16x32]